41 research outputs found

    A Comparative Study on Deep Learning Models for Text Classification of Unstructured Medical Notes with Various Levels of Class Imbalance

    Background: Discharge medical notes written by physicians contain important information about the health condition of patients. Many deep learning algorithms have been successfully applied to extract important information from unstructured medical notes data that can entail subsequent actionable results in the medical domain. This study aims to explore the model performance of various deep learning algorithms in text classification tasks on medical notes with respect to different disease class imbalance scenarios. Methods: In this study, we employed seven artificial intelligence models, a CNN (Convolutional Neural Network), a Transformer encoder, a pretrained BERT (Bidirectional Encoder Representations from Transformers), and four typical sequence neural network models, namely, RNN (Recurrent Neural Network), GRU (Gated Recurrent Unit), LSTM (Long Short-Term Memory), and Bi-LSTM (Bi-directional Long Short-Term Memory), to classify the presence or absence of 16 disease conditions from patients’ discharge summary notes. We analyzed this question as a composition of 16 separate binary classification problems. The performance of the seven models on each of the 16 datasets with various levels of imbalance between classes was compared in terms of AUC-ROC (Area Under the Curve of the Receiver Operating Characteristic), AUC-PR (Area Under the Curve of Precision and Recall), F1 Score, and Balanced Accuracy, as well as the training time. The model performances were also compared in combination with different word embedding approaches (GloVe, BioWordVec, and no pre-trained word embeddings). Results: The analyses of these 16 binary classification problems showed that the Transformer encoder model performs the best in nearly all scenarios. In addition, when the disease prevalence is close to or greater than 50%, the Convolutional Neural Network model achieved a comparable performance to the Transformer encoder, and its training time was 17.6% shorter than the second fastest model, 91.3% shorter than the Transformer encoder, and 94.7% shorter than the pre-trained BERT-Base model. The BioWordVec embeddings slightly improved the performance of the Bi-LSTM model in most disease prevalence scenarios, while the CNN model performed better without pre-trained word embeddings. In addition, the training time was significantly reduced with the GloVe embeddings for all models. Conclusions: For classification tasks on medical notes, Transformer encoders are the best choice if the computation resource is not an issue. Otherwise, when the classes are relatively balanced, CNNs are a leading candidate because of their competitive performance and computational efficiency.
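The abstract above compares models with metrics chosen for imbalanced classes. As a minimal sketch of why plain accuracy is not enough in that setting, the following computes F1, balanced accuracy, and a rank-based AUC-ROC on a toy problem. The labels, scores, the 0.5 cutoff, and all function names are illustrative assumptions and are not taken from the study.

```python
# Illustrative only: toy labels and scores, not data from the study.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (both classes must be present)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2

def auc_roc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. This pairwise form is O(n_pos * n_neg),
    which is fine for a small example.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A 20% prevalence toy problem with hypothetical model scores.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed cutoff of 0.5

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = f1_score(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(accuracy, f1, bal_acc, auc)
```

On this toy input, accuracy is 0.8 while F1 is 0.5 and balanced accuracy is 0.6875, because one of the two positives is missed. The AUC-ROC of 0.9375 does not depend on the cutoff.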

    A Multicenter Mixed-Effects Model For Inference and Prediction of 72-h Return Visits to the Emergency Department for Adult Patients with Trauma-Related Diagnoses

    Objective: Emergency department (ED) return visits within 72 h may be a sign of poor quality of care and entail unnecessary use of healthcare resources. In this study, we compare the performance of two leading statistical and machine learning classification algorithms, and we use the best performing approach to identify novel risk factors of ED return visits. Methods: We analyzed 3.2 million ED encounters with at least one diagnosis under “injury, poisoning and certain other consequences of external causes” and “external causes of morbidity.” These encounters included patients 18 years or older from across 128 emergency room facilities in the USA. For each encounter, we calculated the 72-h ED return status and retrieved 57 features from demographics, diagnoses, procedures, and medications administered during the course of medical care. We implemented a mixed-effects model to assess the effects of the covariates while accounting for the hierarchical structure of the data. Additionally, we investigated the predictive accuracy of the extreme gradient boosting tree ensemble approach and compared the performance of the two methods. Results: The mixed-effects model indicates that certain blunt force and non-blunt force traumas increase the risk of a return visit. Notably, patients with trauma to the head and patients with burns and corrosions have elevated risks. This is in addition to 11 other classes of both blunt force and non-blunt force traumas. In addition, prior healthcare resource utilization, one or more prior return visits within the last 6 months, prior ED visits, and the number of hospitalizations within the last 6 months are associated with increased risk of returning to the ED after discharge. The area under the receiver operating characteristic curve (AUROC) of the mixed-effects model was 0.710 (0.707, 0.712), whereas the gradient boosting tree ensemble had a lower AUROC of 0.698 (0.696, 0.700) on the independent test set. Conclusions: The proposed mixed-effects model achieved the highest known AUC and resulted in the identification of novel risk factors. The model outperformed one of the leading machine learning ensemble classifiers, the extreme gradient boosting tree, in terms of model performance. The risk factors we identified can help emergency departments decrease the number of unplanned return visits within 72 h.
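The abstract reports each AUROC with an interval but does not say how the interval was obtained. One common option is a percentile bootstrap over test encounters, sketched below as an assumption rather than the authors' procedure. The data, function names, and the choice of 500 resamples are hypothetical. The sketch also resamples encounters independently, which ignores the facility-level clustering that the mixed-effects model accounts for.

```python
import random

def auroc(y_true, scores):
    """Rank-based AUROC; ties between a positive and a negative count one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auroc_ci(y_true, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (an assumed method)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auroc(yt, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical test set: 12 non-returns (0) and 8 returns (1).
y_true = [0] * 12 + [1] * 8
scores = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.6, 0.7,
          0.35, 0.5, 0.55, 0.65, 0.75, 0.8, 0.9, 0.95]

point = auroc(y_true, scores)
lo, hi = bootstrap_auroc_ci(y_true, scores)
print(f"AUROC {point:.3f} ({lo:.3f}, {hi:.3f})")
```

With millions of encounters, as in the study, such an interval becomes very narrow, which is in line with the tight intervals quoted in the abstract.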

    Obesity Heterogeneity by Neighborhood Context in a Largely Latinx Sample

    The neighborhood socioeconomic context in which Latinx children live may influence body weight status. Los Angeles County and Orange County in Southern California are both among the ten counties with the largest Latinx populations in the USA. This heterogeneity allowed us to estimate differential impacts of neighborhood environment on children’s body mass index z-scores by race/ethnicity using novel methods and a rich data source. We geocoded pediatric electronic medical record data from a predominantly Latinx sample and characterized neighborhoods into unique residential contexts using latent profile modeling techniques. We estimated multilevel linear regression models that adjust for comorbid conditions and found that a child’s place of residence is independently associated with higher body mass index z-scores. Interactions further reveal that Latinx children living in Middle-Class neighborhoods have higher BMI z-scores than Asian and Other Race children residing in the most disadvantaged communities. Our findings underscore the complex relationship between community racial/ethnic composition and neighborhood socioeconomic context on body weight status during childhood.

    A Multivariable Model of Parent Satisfaction, Pain, and Opioid Administration in a Pediatric Emergency Department

    Introduction: Children and adolescents are not impervious to the unprecedented epidemic of opioid misuse in the United States. In 2016 more than 88,000 adolescents between the ages of 12–17 reported misusing opioid medication, and evidence suggests that there has been a rise in opioid-related mortality for pediatric patients. A major source of prescribed opioids for the treatment of pain is the emergency department (ED). The current study sought to assess the complex relationship between opioid administration, pain severity, and parent satisfaction with children’s care in a pediatric ED. Methods: We examined data from a tertiary pediatric care facility. A health survey questionnaire was administered after ED discharge to capture the outcome of parental likelihood of providing a positive facility rating. We abstracted patient demographic, clinical, and top diagnostic information using electronic health records. Data were merged and multivariable models were constructed. Results: We collected data from 15,895 pediatric patients between the ages of 0–17 years (mean = 6.69; standard deviation = 5.19) and their parents. Approximately 786 (4.94%) patients were administered an opioid; 8212 (51.70%) were administered a non-opioid analgesic; and 3966 (24.95%) expressed clinically significant pain (pain score ≥ 4). Results of a multivariable regression analysis from these pediatric patients revealed a three-way interaction of age, pain severity, and opioid administration (odds ratio 1.022, 95% confidence interval, 1.006, 1.038, P = 0.007). Our findings suggest that opioid administration negatively impacted parent satisfaction of older adolescent patients in milder pain who were administered an opioid analgesic, but positively influenced the satisfaction scores of parents of younger children who were administered opioids. When pain levels were severe, the relationship between age and patient experience was not statistically significant. Conclusion: This investigation highlights the complexity of the relationship between opioid administration, pain severity, and satisfaction, and suggests that the impact of opioid administration on parent satisfaction is a function of the age of the child.
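The reported interaction term (odds ratio 1.022, 95% confidence interval 1.006 to 1.038, P = 0.007) can be sanity-checked with a little arithmetic. The sketch below assumes the interval is a standard Wald interval on the log-odds scale, which the abstract does not state. Under that assumption it recovers the coefficient, its standard error, and an approximate two-sided p-value from the three published numbers.

```python
import math

# Values quoted in the abstract.
odds_ratio = 1.022
ci_low, ci_high = 1.006, 1.038

# Assumption: a Wald interval, exp(beta +/- z_crit * se), at the 95% level.
z_crit = 1.959964

beta = math.log(odds_ratio)  # coefficient on the log-odds scale
se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
z = beta / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value

print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p_value:.4f}")
```

The recovered p-value is about 0.006, which agrees with the reported P = 0.007 once rounding of the published odds ratio and interval is taken into account.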

    Optimal Multi-Stage Arrhythmia Classification Approach

    Arrhythmia constitutes a problem with the rate or rhythm of the heartbeat, and an early diagnosis is essential for the timely inception of successful treatment. We have jointly optimized the entire multi-stage arrhythmia classification scheme based on 12-lead surface ECGs that attains the accuracy performance level of professional cardiologists. The new approach comprises a three-step noise reduction stage, a novel feature extraction method and an optimal classification model with finely tuned hyperparameters. We carried out an exhaustive study comparing thousands of competing classification algorithms that were trained on our proprietary, large and expertly labeled dataset consisting of 12-lead ECGs from 40,258 patients with four arrhythmia classes: atrial fibrillation, general supraventricular tachycardia, sinus bradycardia and sinus rhythm including sinus irregularity rhythm. Our results show that the optimal approach, consisting of a Low Band Pass filter, Robust LOESS, Non Local Means smoothing, a proprietary feature extraction method based on percentiles of the empirical distribution of ratios of interval lengths and magnitudes of peaks and valleys, and an Extreme Gradient Boosting Tree classifier, achieved an F1-Score of 0.988 on patients without additional cardiac conditions. The same noise reduction and feature extraction methods combined with a Gradient Boosting Tree classifier achieved an F1-Score of 0.97 on patients with additional cardiac conditions. Our method achieved the highest classification accuracy (average 10-fold cross-validation F1-Score of 0.992) on an external validation dataset, the MIT-BIH arrhythmia database. The proposed optimal multi-stage arrhythmia classification approach can dramatically benefit automatic ECG data analysis by providing cardiologist-level accuracy and robust compatibility with various ECG data sources.

    A High-Precision Machine Learning Algorithm to Classify Left and Right Outflow Tract Ventricular Tachycardia

    Introduction: Multiple algorithms based on 12-lead ECG measurements have been proposed to identify the right ventricular outflow tract (RVOT) and left ventricular outflow tract (LVOT) locations from which ventricular tachycardia (VT) and frequent premature ventricular complex (PVC) originate. However, a clinical-grade machine learning algorithm that automatically analyzes characteristics of 12-lead ECGs and predicts RVOT or LVOT origins of VT and PVC is not currently available. The effective ablation sites of RVOT and LVOT, confirmed by a successful ablation procedure, provide evidence to create RVOT and LVOT labels for the machine learning model. Methods: We randomly sampled training, validation, and testing data sets from 420 patients who underwent successful catheter ablation (CA) to treat VT or PVC, containing 340 (81%), 38 (9%), and 42 (10%) patients, respectively. We iteratively trained a machine learning algorithm supplied with 1,600,800 features extracted via our proprietary algorithm from 12-lead ECGs of the patients in the training cohort. The area under the curve (AUC) of the receiver operating characteristic curve was calculated from the internal validation data set to choose an optimal discretization cutoff threshold. Results: The proposed approach attained the following performance: accuracy (ACC) of 97.62 (87.44–99.99), weighted F1-score of 98.46 (90–100), AUC of 98.99 (96.89–100), sensitivity (SE) of 96.97 (82.54–99.89), and specificity (SP) of 100 (62.97–100). Conclusions: The proposed multistage diagnostic scheme attained clinical-grade precision of prediction for LVOT and RVOT locations of VT origin with fewer applicability restrictions than prior studies.
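The abstract says a cutoff threshold was chosen from the ROC curve on the internal validation set, without naming the selection criterion. One widely used criterion is Youden's J statistic (sensitivity + specificity - 1), sketched below as an assumption and not as the authors' rule. The validation labels and scores are made up, and coding one outflow tract as class 1 is an arbitrary choice for illustration.

```python
def youden_threshold(y_true, scores):
    """Pick the score cutoff that maximizes Youden's J on validation data.

    A sample is predicted positive when its score is >= the cutoff.
    If several cutoffs tie, the lowest one is returned.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1  # sensitivity + specificity - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical validation cohort (1 = one outflow tract, 0 = the other).
val_labels = [0, 0, 0, 1, 0, 1, 1, 1]
val_scores = [0.1, 0.2, 0.35, 0.4, 0.45, 0.6, 0.7, 0.9]

best_t, best_j = youden_threshold(val_labels, val_scores)
print(best_t, best_j)
```

The cutoff is fixed on the validation data and then applied unchanged to the held-out test cohort, so the test metrics are not tuned on the test patients.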